Domain Adaptation for Statistical Classifiers
نویسندگان
چکیده
The most basic assumption used in statistical learning theory is that training data and test data are drawn from the same underlying distribution. Unfortunately, in many applications, the “in-domain” test data is drawn from a distribution that is related, but not identical, to the “out-of-domain” distribution of the training data. We consider the common case in which labeled out-of-domain data is plentiful, but labeled in-domain data is scarce. We introduce a statistical formulation of this problem in terms of a simple mixture model and present an instantiation of this framework to maximum entropy classifiers and their linear chain counterparts. We present efficient inference algorithms for this special case based on the technique of conditional expectation maximization. Our experimental results show that our approach leads to improved performance on three real world tasks on four different data sets from the natural language processing domain.
منابع مشابه
Bagging-based System Combination for Domain Adaptation
Domain adaptation plays an important role in multi-domain SMT. Conventional approaches usually resort to statistical classifiers, but they require annotated monolingual data in different domains, which may not be available in some cases. We instead propose a simple but effective bagging-based approach without using any annotated data. Large-scale experiments show that our new method improves tr...
متن کاملActive Learning for Cross-domain Sentiment Classification
In the literature, various approaches have been proposed to address the domain adaptation problem in sentiment classification (also called cross-domain sentiment classification). However, the adaptation performance normally much suffers when the data distributions in the source and target domains differ significantly. In this paper, we suggest to perform active learning for cross-domain sentime...
متن کاملBootstrapping polarity classifiers with rule-based classification
In this article, we examine the effectiveness of bootstrapping supervised machine-learning polarity classifiers with the help of a domain-independent rulebased classifier that relies on a lexical resource, i.e., a polarity lexicon and a set of linguistic rules. The benefit of this method is that though no labeled training data are required, it allows a classifier to capture in-domain knowledge ...
متن کاملAutomatic Domain Adaptation for Word Sense Disambiguation Based on Comparison of Multiple Classifiers
Domain adaptation (DA), which involves adapting a classifier developed from source to target data, has been studied intensively in recent years. However, when DA for word sense disambiguation (WSD) was carried out, the optimal DA method varied according to the properties of the source and target data. This paper proposes automatic DA based on comparing the degrees of confidence of multiple clas...
متن کاملRobust Visual Knowledge Transfer via EDA
—We address the problem of visual knowledge adaptation by leveraging labeled patterns from source domain and a very limited number of labeled instances in target domain to learn a robust classifier for visual categorization. This paper proposes a new extreme learning machine based cross-domain network learning framework, that is called Extreme Learning Machine (ELM) based Domain Adaptation (EDA...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- J. Artif. Intell. Res.
دوره 26 شماره
صفحات -
تاریخ انتشار 2006